# Multilingual TTS
Tts 1.6b En Fr
Kyutai TTS is a streaming text-to-speech model that supports real-time speech generation and multilingual processing.
Speech Synthesis Supports Multiple Languages
kyutai
1,441
90
Spark TTS 0.5B GGUF
A quantized GGUF version of prince-canuma/Spark-TTS-0.5B, supporting text-to-speech tasks in English and Chinese.
Speech Synthesis Supports Multiple Languages
mradermacher
318
0
Llama OuteTTS 1.0 1B 3bit
This is a 3-bit quantized text-to-speech model in MLX format, supporting multiple languages.
Speech Synthesis Supports Multiple Languages
mlx-community
16
0
Indicf5
IndicF5 is a near-human-level multilingual text-to-speech (TTS) model supporting 11 Indian languages, trained on 1417 hours of high-quality speech data.
Speech Synthesis Other
shethjenil
46
0
CSM 1B
Apache-2.0
A text-to-speech model released under the Apache-2.0 license, supporting English.
Speech Synthesis English
drbaph
43
2
Fish Speech 1.5
Fish Speech V1.5 is a leading text-to-speech (TTS) model, trained on over 1 million hours of multilingual audio data.
Speech Synthesis Supports Multiple Languages
ModelsLab
98
3
Zonos V0.1 Transformer
Apache-2.0
Zonos-v0.1 is a leading open-weight text-to-speech model trained on over 200,000 hours of multilingual speech data, delivering expressiveness and quality comparable to or even surpassing top-tier TTS service providers.
Speech Synthesis
Isi99999
30
0
Outetts 0.3 1B GGUF
OuteTTS-0.3-1B is a multilingual text-to-speech model developed by OuteAI, supporting English, Chinese, Japanese, Korean, French, and German.
Speech Synthesis Supports Multiple Languages
gaianet
34
0
Outetts 0.3 1B GGUF
OuteTTS-0.3-1B is a multilingual text-to-speech model developed by OuteAI and quantized by Second State Inc.
Speech Synthesis Supports Multiple Languages
second-state
151
1
Outetts 0.3 500M GGUF
OuteTTS-0.3-500M is a multilingual text-to-speech model developed by OuteAI and released under the cc-by-nc-4.0 license.
Speech Synthesis Supports Multiple Languages
gaianet
79
0
Outetts 0.3 500M GGUF
OuteTTS-0.3-500M is a multilingual text-to-speech model supporting English, Chinese, Japanese, Korean, French, and German.
Speech Synthesis Supports Multiple Languages
second-state
49
1
Outetts 0.2 500M GGUF
OuteTTS-0.2-500M is a multilingual text-to-speech model developed by OuteAI, supporting English, Chinese, Japanese, and Korean.
Speech Synthesis Supports Multiple Languages
gaianet
44
0
Outetts 0.2 500M GGUF
OuteTTS-0.2-500M is a multilingual text-to-speech model supporting English, Chinese, Japanese, and Korean.
Speech Synthesis Supports Multiple Languages
second-state
693
0
Helpingai TTS V1
Apache-2.0
HelpingAI-TTS-v1 is a next-generation text-to-speech (TTS) tool focused on personalization, emotional expression, and clarity, supporting multiple languages and emotion customization.
Speech Synthesis
Transformers Supports Multiple Languages

HelpingAI
1,121
6
Fish Speech 1.5
Leading text-to-speech (TTS) model trained on over 1 million hours of multilingual audio data
Speech Synthesis Supports Multiple Languages
jkeisling
194
1
Fish Speech 1.5 Base
MIT
Fish Speech 1.5 is a multilingual text-to-speech model that supports multiple languages and can be used without an access token.
Speech Synthesis Supports Multiple Languages
None1145
111
4
Indri 0.1 124m Tts GGUF
Indri is a text-to-speech (TTS) model supporting English and Hindi, with a parameter size of 124M, optimized for CPU inference in GGUF format.
Speech Synthesis Supports Multiple Languages
11mlabs
86
0
Speecht5 Tts Tamil
MIT
A Tamil speech synthesis model fine-tuned on the common_voice_17_0 dataset based on microsoft/speecht5_tts
Speech Synthesis
Transformers

vasukumarp
30
1
Fish Agent V0.1 3b
A groundbreaking speech-to-speech model capable of accurately capturing and generating environmental audio information, while featuring advanced text-to-speech capabilities.
Speech Synthesis Supports Multiple Languages
fishaudio
653
259
Speect5 Common Voice Hindi
MIT
A Hindi speech synthesis model fine-tuned on the common_voice_17_0 dataset based on microsoft/speecht5_tts
Speech Synthesis
Transformers Other

Solo448
36
0
XTTS Hindi Finetuned
Other
A version of Coqui-AI's XTTS v2 model fine-tuned on Hindi speech data, supporting voice cloning and multilingual speech generation.
Speech Synthesis
Abhinay45
34
9
Speecht5 Tts Vie
MIT
A Vietnamese text-to-speech (TTS) model fine-tuned from microsoft/speecht5_tts on the generator dataset.
Speech Synthesis
Transformers Other

Nyanmero
18
0
Vixtts
Other
viⓍTTS is a voice generation model supporting 18 languages, specifically optimized for Vietnamese, achieving cross-lingual voice cloning with just 6 seconds of audio.
Speech Synthesis
Transformers Other

capleaf
2,782
76
Speecht5 Finetuned Voxpopuli Ro
MIT
A text-to-speech model fine-tuned on the VoxPopuli dataset based on microsoft/speecht5_tts
Speech Synthesis
Transformers

mitro99
86
0
Speecht5 Tts Portuguese
MIT
A Portuguese text-to-speech model fine-tuned based on Microsoft's SpeechT5 architecture, supporting high-quality speech synthesis
Speech Synthesis
Transformers Other

flavioegoncalves
20
2
Speecht5 Finetuned Multilingual Librispeech De
MIT
A text-to-speech model based on Microsoft's SpeechT5, fine-tuned on the German subset of the Multilingual LibriSpeech dataset
Speech Synthesis
Transformers German

semaj83
14
0
Mms Tts Mya
Burmese text-to-speech model developed by Meta, part of the Massively Multilingual Speech (MMS) project
Speech Synthesis
Transformers

facebook
1,304
4
Mms Tts Cmo Script Khmer
A Central Mnong text-to-speech model developed by Meta, supporting conversion of text to natural speech
Speech Synthesis
Transformers

facebook
142
1
Mms Tts Kir
A Kyrgyz text-to-speech model developed by Meta, based on the VITS architecture, supporting high-quality speech synthesis.
Speech Synthesis
Transformers

facebook
149
4
Mms Tts Nya
Chichewa text-to-speech model developed by Meta AI, based on VITS architecture, supporting high-quality speech synthesis
Speech Synthesis
Transformers

facebook
23
0
Mms Tts Vie
Vietnamese text-to-speech model developed by Meta, based on the VITS architecture, supporting high-quality speech synthesis
Speech Synthesis
Transformers

facebook
3,616
27
Mms Tts Mal
Malayalam text-to-speech model in Facebook's MMS project, implementing end-to-end speech synthesis based on VITS architecture
Speech Synthesis
Transformers

facebook
307
2
Bark Small
MIT
Bark is a Transformer-based multilingual text-to-audio model developed by Suno, capable of generating realistic speech, music, and non-verbal sounds
Speech Synthesis
Transformers Supports Multiple Languages

suno
22.74k
201
Speecht5 Finetuned Google Fleurs Greek
MIT
Greek text-to-speech model fine-tuned based on microsoft/speecht5_tts
Speech Synthesis
Transformers

Sandiago21
17
2
Speecht5 Finetuned Common Voice 13 0 Euskera
MIT
A text-to-speech model fine-tuned on the Common Voice 13.0 Basque dataset based on Microsoft's SpeechT5 architecture
Speech Synthesis
Transformers

dvinagre
29
0
Speecht5 Tts Finetuned Voxpopuli Sk V2
MIT
A text-to-speech model fine-tuned on the Slovak VoxPopuli dataset based on Microsoft's SpeechT5 architecture
Speech Synthesis
Transformers

vineetsharma
35
1
Mms Tts Kor
Korean text-to-speech model from Meta's Massively Multilingual Speech project, supporting natural speech conversion from Korean text
Speech Synthesis
Transformers

Matthijs
29
2
Speecht5 Tts Common Voice Zh
MIT
Chinese text-to-speech model fine-tuned from microsoft/speecht5_tts
Speech Synthesis
Transformers Chinese

wuula
65
6
Speecht5 Tts Commonvoice Ca
MIT
Catalan text-to-speech model based on the SpeechT5 architecture, fine-tuned on the Common Voice 11.0 dataset
Speech Synthesis
Transformers Other

wetdog
22
0